Architecting a Large-scale Elastic Environment - Recontextualization and Adaptive Cloud Services for Scientific Computing

نویسندگان

  • Paul Marshall
  • Henry M. Tufo
  • Katarzyna Keahey
  • David La Bissoniere
  • Matthew Woitaszek
چکیده

Infrastructure-as-a-service (IaaS) clouds, such as Amazon EC2, offer pay-for-use virtual resources ondemand. This allows users to outsource computation and storage when needed and create elastic computing environments that adapt to changing demand. However, existing services, such as cluster resource managers (e.g. Torque), do not include support for elastic environments. Furthermore, no recontextualization services exist to reconfigure these environments as they continually adapt to changes in demand. In this paper we present an architecture for a large-scale elastic cluster environment. We extend an open-source elastic IaaS manager, the Elastic Processing Unit (EPU), to support the Torque batch-queue scheduler. We also develop a lightweight REST-based recontextualization broker that periodically reconfigures the cluster as nodes join or leave the environment. Our solution adds nodes dynamically at runtime and supports MPI jobs across distributed resources. For experimental evaluation, we deploy our solution using both NSF FutureGrid and Amazon EC2. We demonstrate the ability of our solution to create multi-cloud deployments and run batchqueued jobs, recontextualize 256 node clusters within one second of the recontextualization period, and scale to over 475 nodes in less than 15 minutes.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

An Optimal Utilization of Cloud Resources using Adaptive Back Propagation Neural Network and Multi-Level Priority Queue Scheduling

With the innovation of cloud computing industry lots of services were provided based on different deployment criteria. Nowadays everyone tries to remain connected and demand maximum utilization of resources with minimum timeand effort. Thus, making it an important challenge in cloud computing for optimum utilization of resources. To overcome this issue, many techniques have been proposed ...

متن کامل

A review of methods for resource allocation and operational framework in cloud computing

The issue of management and allocation of resources in cloud computing environments, according to the breadth of scale and modern technology implementation, is a complicated issue. Issues such as: the heterogeneity of resources, resource dependencies to each other, the dynamics of the environment, virtualization, workload diversity as well as a wide range of management objectives of cloud servi...

متن کامل

A Model based on Cloud Computing for the implementation and management IT services in Banks

In recent years, the banking industry has made significant changes in technology and communications. The expansion of electronic communications and a large number of people around the world access to the Internet, appropriate to establish trade and economic exchanges provided but high costs, lack of flexibility and agility in existing systems because of the large volume of information, confiden...

متن کامل

Data Replication-Based Scheduling in Cloud Computing Environment

Abstract— High-performance computing and vast storage are two key factors required for executing data-intensive applications. In comparison with traditional distributed systems like data grid, cloud computing provides these factors in a more affordable, scalable and elastic platform. Furthermore, accessing data files is critical for performing such applications. Sometimes accessing data becomes...

متن کامل

A Model based on Cloud Computing for the implementation and management IT services in Banks

In recent years, the banking industry has made significant changes in technology and communications. The expansion of electronic communications and a large number of people around the world access to the Internet, appropriate to establish trade and economic exchanges provided but high costs, lack of flexibility and agility in existing systems because of the large volume of information, confiden...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2012